Classifier evaluation metrics

Read the data


In [1]:
%matplotlib inline

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
Data = pd.read_csv('DataExample.csv')
Data.head()


Out[1]:
ClassEdemaPre ClassEdemaPost ClassTissueFlair ClassEdemaFlair ClassTumorFlair ClassTissuePost ClassTumorPost ClassTissuePre ClassTumorPre
0 1.153593 1.043390 0.568602 1.244563 1.263765 0.269377 1.771372 0.832707 1.200128
1 1.207222 1.111777 0.059813 1.267483 2.170994 0.025167 1.607958 0.119511 1.210576
2 1.129254 0.924474 0.046106 1.318608 1.839030 0.049453 1.709299 0.045267 1.245954
3 1.015360 1.037350 0.303170 1.420815 1.416610 0.906734 2.225173 0.999662 1.128684
4 1.030810 0.916476 1.064397 1.742959 0.854583 0.930869 1.864301 0.923693 0.869097
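
Since the next cell drops NaN entries column by column, it can be worth confirming the table's size and how many missing values each column holds. A minimal check could look like this:

# Table dimensions and per-column count of missing values
print(Data.shape)
print(Data.isnull().sum())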

In [2]:
# For each class, select one column, flatten it to a 1-D array, and drop NaN entries
ClassBrainTissuepost = Data['ClassTissuePost'].values
ClassBrainTissuepost = ClassBrainTissuepost[~np.isnan(ClassBrainTissuepost)]
ClassBrainTissuepre = Data['ClassTissuePre'].values
ClassBrainTissuepre = ClassBrainTissuepre[~np.isnan(ClassBrainTissuepre)]
ClassTUMORpost = Data['ClassTumorPost'].values
ClassTUMORpost = ClassTUMORpost[~np.isnan(ClassTUMORpost)]
ClassTUMORpre = Data['ClassTumorPre'].values
ClassTUMORpre = ClassTUMORpre[~np.isnan(ClassTUMORpre)]

# Feature matrix: two features per sample (the 'Post' and 'Pre' columns),
# with the brain-tissue samples first and the tumor samples after them
X_1 = np.stack((ClassBrainTissuepost, ClassBrainTissuepre))
X_2 = np.stack((ClassTUMORpost, ClassTUMORpre))
X = np.concatenate((X_1.transpose(), X_2.transpose()), axis=0)

# Labels: 0 = brain tissue, 1 = tumor
y = np.zeros(np.shape(X)[0])
y[np.shape(X_1)[1]:] = 1
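
As a quick sanity check, X should have two columns (the 'Post' and 'Pre' values) and y should contain only the labels 0 (tissue) and 1 (tumor). A short sketch:

# Expect X of shape (n_samples, 2) and two label values with their counts
print(X.shape, y.shape)
print(np.unique(y, return_counts=True))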

Split the data into training and testing sets


In [3]:
# split X and y into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
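
With no test_size given, train_test_split holds out 25% of the samples for testing. A brief check of the split sizes, plus an optional stratified split (an assumption, not used in the call above):

# Sizes of the training and testing sets
print(X_train.shape, X_test.shape)
# Optional: preserve the class ratio in both splits
# X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, stratify=y)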

Train a logistic regression model on the training set


In [4]:
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
logreg.fit(X_train, y_train)


Out[4]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
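
The fitted model's parameters can be inspected directly; with two input features there is one coefficient per feature plus an intercept. A minimal sketch:

# One learned coefficient per feature plus a single intercept term
print(logreg.coef_)
print(logreg.intercept_)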

Make class predictions for the testing set


In [5]:
y_pred_class = logreg.predict(X_test)
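
Besides hard class labels, the model can also return predicted probabilities, which the threshold-based metrics further below would normally use. A sketch (the name y_pred_prob is introduced here for illustration):

# Predicted probability of the positive class (tumor, label 1) for each test sample
y_pred_prob = logreg.predict_proba(X_test)[:, 1]
print(y_pred_prob[:5])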

Calculate accuracy


In [6]:
# Classification accuracy: percentage of correct predictions

from sklearn import metrics
print(metrics.accuracy_score(y_test, y_pred_class))


0.9364
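
Accuracy is most meaningful when compared with the null accuracy, i.e. the accuracy achieved by always predicting the most frequent class. A minimal sketch for a 0/1 label vector:

# Null accuracy: accuracy of always predicting the majority class
print(max(y_test.mean(), 1 - y_test.mean()))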

Confusion matrix


In [7]:
print(metrics.confusion_matrix(y_test, y_pred_class))


[[1188   75]
 [  84 1153]]

In [8]:
# save confusion matrix and slice into four pieces
confusion = metrics.confusion_matrix(y_test, y_pred_class)
TP = confusion[1, 1]
TN = confusion[0, 0]
FP = confusion[0, 1]
FN = confusion[1, 0]
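
The raw matrix is easier to read once the rows (true classes) and columns (predicted classes) are labeled, for instance with a small pandas DataFrame (class names assumed from the label coding above: tissue = 0, tumor = 1):

# Rows: true class, columns: predicted class
print(pd.DataFrame(confusion,
                   index=['true tissue (0)', 'true tumor (1)'],
                   columns=['pred tissue (0)', 'pred tumor (1)']))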

Classification Accuracy

Overall, how often is the classifier correct?


In [9]:
print((TP + TN) / float(TP + TN + FP + FN))
print(metrics.accuracy_score(y_test, y_pred_class))


0.9364
0.9364

Classification Error

Overall, how often is the classifier incorrect?


In [10]:
print((FP + FN) / float(TP + TN + FP + FN))
print(1 - metrics.accuracy_score(y_test, y_pred_class))


0.0636
0.0636

Sensitivity

When the actual value is positive, how often is the prediction correct?


In [11]:
print(TP / float(TP + FN))
print(metrics.recall_score(y_test, y_pred_class))


0.932093775263
0.932093775263

Specificity

When the actual value is negative, how often is the prediction correct?


In [12]:
print(TN / float(TN + FP))


0.940617577197
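
scikit-learn has no dedicated specificity function, but specificity is simply the recall of the negative class, so the same value can be obtained from recall_score by switching the positive label:

# Specificity = recall of the negative (tissue, label 0) class
print(metrics.recall_score(y_test, y_pred_class, pos_label=0))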

False Positive Rate

When the actual value is negative, how often is the prediction incorrect?


In [13]:
print(FP / float(TN + FP))


0.0593824228029

Precision

When a positive value is predicted, how often is the prediction correct?


In [14]:
print(TP / float(TP + FP))
print(metrics.precision_score(y_test, y_pred_class))


0.938925081433
0.938925081433
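
Precision and sensitivity (recall) are commonly combined into the F1 score, their harmonic mean; classification_report summarizes all of these per class. A brief sketch:

# F1 score of the positive class, plus a per-class summary table
print(metrics.f1_score(y_test, y_pred_class))
print(metrics.classification_report(y_test, y_pred_class))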

ROC Curves and Area Under the Curve (AUC)


In [15]:
# IMPORTANT: the first argument is the true labels; the second should normally be
# the predicted probabilities. Passing the hard class predictions, as done here,
# yields an ROC curve with only a few points (see the probability-based sketch below).
fpr, tpr, thresholds = metrics.roc_curve(y_test, y_pred_class)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve for the tumor vs. brain tissue classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.grid(True)
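
Since roc_curve above is given the hard 0/1 predictions, the plotted curve has only a few points. Computed from predicted probabilities it gives the usual curve; a sketch under that assumption:

# ROC curve from predicted probabilities rather than hard class labels
y_pred_prob = logreg.predict_proba(X_test)[:, 1]
fpr_prob, tpr_prob, thresholds_prob = metrics.roc_curve(y_test, y_pred_prob)
plt.plot(fpr_prob, tpr_prob)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.grid(True)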



In [16]:
# define a function that accepts a threshold and prints sensitivity and specificity
def evaluate_threshold(threshold):
    print('Sensitivity:', tpr[thresholds > threshold][-1])
    print('Specificity:', 1 - fpr[thresholds > threshold][-1])
    
evaluate_threshold(0.5)


Sensitivity: 0.9320937752627324
Specificity: 0.94061757719714967
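
The same helper can be run over a range of thresholds to see the sensitivity/specificity trade-off; with the hard-prediction arrays above every threshold between 0 and 1 gives the same answer, so the fpr/tpr/thresholds globals are first recomputed from predicted probabilities in this sketch:

# Recompute the curve from predicted probabilities so different thresholds differ
fpr, tpr, thresholds = metrics.roc_curve(y_test, logreg.predict_proba(X_test)[:, 1])
for t in [0.3, 0.5, 0.7]:
    print('threshold =', t)
    evaluate_threshold(t)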

AUC

The percentage of the ROC plot that is underneath the curve:


In [17]:
print(metrics.roc_auc_score(y_test, y_pred_class))


0.93635567623
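
Note that roc_auc_score is likewise normally given predicted probabilities; with hard 0/1 predictions, as above, it reduces to the average of sensitivity and specificity. A probability-based sketch:

# AUC from predicted probabilities (same assumption as the ROC sketch above)
print(metrics.roc_auc_score(y_test, logreg.predict_proba(X_test)[:, 1]))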

Calculate cross-validated AUC


In [18]:
from sklearn.model_selection import cross_val_score
cross_val_score(logreg, X, y, cv=10, scoring='roc_auc').mean()


Out[18]:
0.97659079999999998
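
The same call accepts any scikit-learn scoring string, so cross-validated accuracy, for example, only requires swapping the scorer. A sketch:

# 10-fold cross-validated accuracy, for comparison with the AUC above
print(cross_val_score(logreg, X, y, cv=10, scoring='accuracy').mean())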
